Neural Network Learning: Theoretical Foundations -- A Review
Abstract
Neural Network Learning: Theoretical Foundations, Martin Anthony and Peter L. Bartlett, Cambridge University Press, Cambridge, U.K., 1999, 389 pp., ISBN 0-521-57353-X, $59.95 (hardcover).

The scientific method aims to derive mathematical models that help us to understand and exploit phenomena, whether they be natural or human made. Machine learning, and more particularly learning with neural networks, can be viewed as just such a phenomenon. Frequently, remarkable performance is obtained by training networks to perform relatively complex AI tasks. Despite this success, most practitioners would readily admit that they are far from fully understanding why and, more importantly, when the techniques can be expected to be effective. The need for a fuller theoretical analysis and understanding of their performance has been a major research objective for the last decade. Neural Network Learning: Theoretical Foundations reports on important developments that have been made toward this goal within the computational learning theory framework.

Results from computational learning theory typically make fewer assumptions and, therefore, stronger statements than, for example, a Bayesian analysis. This generality can be both a strength and a weakness. Its strength is in the general applicability of the results. However, its weakness follows because a more general result must be more pessimistic to still hold true in the worst case. A similar difference exists between parametric and nonparametric statistical tests. Parametric tests are only valid if the data satisfy certain assumptions. If these assumptions hold, they will, however, typically give more accurate results. The analysis of statistical learning theory has very much the flavor of a nonparametric statistical test. Almost no assumptions are made about the distribution generating the data. In addition, its bounds hold with high probability, in the same way that significance in a statistical test indicates the probability that the data have misled you into accepting a particular hypothesis. For this reason, computational learning theory results are often referred to as probably (that is, with high probability or significance) approximately correct (that is, the generalization error is low), or pac. The weakness of pac, therefore, is that its results must hold true even in worst-case distributions.

There is, however, a new twist to this story in that the more recent pac-style results are able to take account of observed attributes of the function that has been chosen by the learner, for example, its margin on the training set. Such attributes measure how beneficial the particular distribution is and feed directly into the bound on the generalization, hence helping to motivate learning strategies that attempt to minimize the particular bound, for example, by maximizing the margin. For this reason, the new style of analysis is often referred to as data dependent. Bartlett and Anthony have been two of the researchers driving these developments and, hence, are particularly well placed to produce a book, one of whose main goals is to show the reader how these new results affect neural network learning.

The first part of the book looks at classification using binary-output neural networks. The approach presented is the "classical" (non–data-dependent) pac learning analysis of binary classifiers based on the Vapnik-Chervonenkis dimension and associated growth function. A thorough coverage is given of these results, including proofs of all the main theorems. An alternative proof of Sauer's lemma due to Steele (1978) is given, and a detailed description is included of the crucial symmetrization lemma that forms the core of the Vapnik-Chervonenkis theorem. It is clear by this point that the book aims to give a comprehensive account not only of the results but also of their detailed proofs. This thorough...
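To make the contrast between the classical and the data-dependent analyses concrete, the following LaTeX sketch states the two kinds of bound in schematic form. The exact constants and complexity terms are illustrative assumptions, not formulas quoted from the book: d denotes the Vapnik-Chervonenkis dimension of the hypothesis class H, m the sample size, and gamma the margin observed on the training set.

% Schematic pac bounds (illustrative; constants are assumptions, not quoted from the book).
% Classical, distribution-free VC bound: with probability at least 1 - \delta over the sample,
\[
  \mathrm{er}_P(h) \;\le\; \widehat{\mathrm{er}}_S(h)
  \;+\; \sqrt{\frac{8}{m}\Bigl(d\ln\frac{2em}{d} + \ln\frac{4}{\delta}\Bigr)}
  \qquad \text{for every } h \in H .
\]
% Data-dependent margin bound: if the chosen h achieves margin \gamma on the m training
% examples (inputs of norm at most R), then with probability at least 1 - \delta,
\[
  \mathrm{er}_P(h) \;\le\; \widehat{\mathrm{er}}^{\,\gamma}_S(h)
  \;+\; \sqrt{\frac{c}{m}\Bigl(\frac{R^2}{\gamma^2}\log^2 m + \ln\frac{1}{\delta}\Bigr)} ,
\]
% where c is a universal constant. A margin-based complexity term replaces the VC dimension,
% so a large observed margin tightens the bound, which motivates margin maximization.

Read this way, the first bound depends only on the hypothesis class, whereas the second depends on a quantity measured on the data actually seen, which is what the review means by a data-dependent analysis.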
Similar resources
EECS 598-005: Theoretical Foundations of Machine Learning, Fall 2015, Lecture 15: Neural Networks Theory
Computational approaches to motor control.
This review will focus on four areas of motor control which have recently been enriched both by neural network and control system models: motor planning, motor prediction, state estimation and motor learning. We will review the computational foundations of each of these concepts and present specific models which have been tested by psychophysical experiments. We will cover the topics of optimal...
Tutorial: Perspectives on Learning with RNNs
We present an overview of current lines of research on learning with recurrent neural networks (RNNs). Topics covered are: understanding and unification of algorithms, theoretical foundations, new efforts to circumvent gradient vanishing, new architectures, and fusion with other learning methods and dynamical systems theory. The structuring guideline is to understand many new approaches as diff...
Learning without local minima in radial basis function networks
Learning from examples plays a central role in artificial neural networks. The success of many learning schemes is not guaranteed, however, since algorithms like backpropagation may get stuck in local minima, thus providing suboptimal solutions. For feedforward networks, optimal learning can be achieved provided that certain conditions on the network and the learning environment are met. This p...
Modeling language and cognition with deep unsupervised learning: a tutorial overview
Deep unsupervised learning in stochastic recurrent neural networks with many layers of hidden units is a recent breakthrough in neural computation research. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. In this article we discuss the theoretical foundations of this approach and we review...
Publication year: 2001